% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils_pctiles_lookup_create.R
\name{pctiles_lookup_create}
\alias{pctiles_lookup_create}
\title{Utility to create lookup table of percentiles 0 to 100 and mean for each indicator by State or USA total}
\usage{
pctiles_lookup_create(
  x,
  zone.vector = NULL,
  zoneOverallName = "USA",
  wts = NULL,
  usecollapse = TRUE,
  type = 7
)
}
\arguments{
\item{x}{data.frame of numeric data. The mean and percentiles of each column
are calculated for each zone.}

\item{zone.vector}{optional vector of zone names, such as states or regions. Must be the same length as wts, or as the number of rows in x.}

\item{zoneOverallName}{optional. Default is USA.}

\item{wts}{leave as the default (NULL), since EJScreen no longer uses weighted percentiles of block groups}

\item{usecollapse}{logical, whether to use collapse::fquantile()
instead of Hmisc::wtd.quantile() and stats::quantile().
This allows testing before the Hmisc dependency is fully removed, and is also meant to be faster.}

\item{type}{DO NOT CHANGE. Moot for EJScreen/EJAM; see the source code.
Hmisc::wtd.quantile() with type "1/n" was used here in the past, and possibly by EJScreen
(EJScreen no longer uses weighted percentiles, so this is moot for the weighted case).
collapse::fquantile() is now used here instead, to avoid the Hmisc dependency.
fquantile type 4 seems to be the same as Hmisc type "1/n", but that has not been confirmed.
This function by default uses fquantile type 1, the inverse of the ECDF,
which seems simpler than type 4, which interpolates linearly between points of the ECDF.
It still needs to be confirmed whether that creates a table different from what EJScreen would create.}
}
\description{
Utility to create lookup table of percentiles 0 to 100 and mean for each indicator by State or USA total
}
\details{
EJScreen assigns each indicator in each block group a percentile value via a Python script, using
\url{https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.percentileofscore.html}

\preformatted{
  As of 2023, the Python function is used so that percentileofscore returns 80 if
  80\% of all indicator values (statewide or nationwide, depending on the type
  being calculated) are strictly less than (NOT equal to) the indicator value
  in the specified block group (since kind="strict").
  The percentile recorded in the EJScreen dataset is the floor of that,
  meaning that if 81.9\% of values are less than x, the percentile is reported as 81.
  The EJScreen Python script used to create percentile lookup tables is in a file
  called cal_statepctile_0222.py, and the key lines of code and functions it uses are

  pctile = math.floor(stats.percentileofscore(barray, indicatorscore, kind="strict"))

  binvalue = getBinvalue(pctile)

  and

  def getBinvalue(pct):
      if pct is None:
          return 0
      else:
          if pct >= 95:
              return 11
          elif pct >= 90 and pct < 95:
              return 10
          elif pct >= 80 and pct < 90:
              return 9
          elif pct >= 70 and pct < 80:
              return 8
          elif pct >= 60 and pct < 70:
              return 7
          elif pct >= 50 and pct < 60:
              return 6
          elif pct >= 40 and pct < 50:
              return 5
          elif pct >= 30 and pct < 40:
              return 4
          elif pct >= 20 and pct < 30:
              return 3
          elif pct >= 10 and pct < 20:
              return 2
          else:
              return 1
  }
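
As an illustration only, the strict percentile, floor, and bin steps described above
can be sketched in plain Python without scipy. The helper names strict_percentile,
floored_percentile, and bin_value, and the sample scores, are hypothetical and are not
part of the EJScreen script. bin_value is a compact form assumed to match getBinvalue
for integer percentiles from 0 to 100.

\preformatted{
  import math

  def strict_percentile(values, score):
      # Share of values strictly below score, on a 0 to 100 scale.
      # Assumed to mirror percentileofscore(values, score, kind="strict").
      below = sum(1 for v in values if v < score)
      return 100.0 * below / len(values)

  def floored_percentile(values, score):
      # The recorded percentile is the floor of the strict percentile.
      return math.floor(strict_percentile(values, score))

  def bin_value(pct):
      # Hypothetical compact equivalent of getBinvalue for integer pct.
      if pct is None:
          return 0
      if pct >= 95:
          return 11
      if pct >= 90:
          return 10
      # 0-9 gives 1, 10-19 gives 2, and so on up to 80-89 giving 9
      return pct // 10 + 1

  # Made-up indicator values for one zone (illustrative only)
  scores = [0.1, 0.2, 0.2, 0.4, 0.5, 0.5, 0.7, 0.8, 0.9, 1.0, 1.2]

  pct = floored_percentile(scores, 0.9)
  print(pct, bin_value(pct))  # prints: 72 8
}

Here 8 of the 11 values are strictly below 0.9, so the strict percentile is about 72.7,
which is recorded as 72 and falls in bin 8. Tied values do not count as being below the score.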
}
\keyword{internal}
